Approximate Retrieval of XML Data with ApproXPath
Abstract
Several XML query languages have been proposed that use XPath expressions to locate data. But XPath expressions might miss some data because of irregularities in the data and schema of an XML data collection. In this paper we propose ApproXPath, which supports approximate path expressions. Approximate path expressions have the same syntax as XPath expressions, but allow content and structural errors. An error is a string or tree edit operation that creates a (virtual) data collection in which the data can be located. ApproXPath extends XPath’s axes, node tests and predicates to utilize the string/tree edit distance. We show that the complexity of ApproXPath is reasonable. For many queries, the inexact matching (with no errors) is as fast as exact matching, and the cost increases linearly with the number of errors allowed.
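To make the idea of a content error concrete, here is a minimal Python sketch of an approximate node test along the child axis. It is not the ApproXPath algorithm from the paper: it only charges string edit distance between each step name and the element tag, and it does not model structural (tree edit) errors. The function names `edit_distance` and `approx_child_path`, the `max_errors` parameter, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute; each costs 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def approx_child_path(root, steps, max_errors):
    """Follow child steps from root, tolerating misspelled tags.

    Hypothetical helper: each step is charged the edit distance between
    the requested name and the child's tag, and a partial match is kept
    only while its accumulated cost stays within max_errors.
    Returns (element, errors) pairs, cheapest first.
    """
    frontier = [(root, 0)]
    for name in steps:
        next_frontier = []
        for node, used in frontier:
            for child in node:
                cost = used + edit_distance(name, child.tag)
                if cost <= max_errors:
                    next_frontier.append((child, cost))
        frontier = next_frontier
    return sorted(frontier, key=lambda pair: pair[1])


# Illustrative document: the first book misspells <title> as <titel>.
DOC = (
    "<library>"
    "<book><titel>XML Basics</titel></book>"
    "<book><title>XPath in Depth</title></book>"
    "</library>"
)
root = ET.fromstring(DOC)

exact = approx_child_path(root, ["book", "title"], max_errors=0)
approx = approx_child_path(root, ["book", "title"], max_errors=2)
```

With `max_errors=0` the sketch behaves like the exact path `book/title` and finds only the correctly tagged element. Allowing two errors also reaches the element under the misspelled `titel` tag, ranked after the exact match.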
Similar resources
Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica
Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...
Approximate Tree Embedding for Querying XML Data
Querying heterogeneous collections of data-centric XML documents requires a combination of database languages and concepts used in information retrieval, in particular similarity search and ranking. In this paper we present an approach to find approximate answers to formal user queries. We reduce the problem of answering queries against XML document collections to the well-known unordered tree ...
FuzzyXPath: Using Fuzzy Logic and IR Features to Approximately Query XML Documents
XML has become a key technology for interoperability, providing a common data model to applications. However, diverse data modeling choices may lead to heterogeneous XML structure and content. In this paper, information retrieval and database-related techniques have been jointly applied to effectively tolerate XML data diversity in the evaluation of flexible queries. Approximate structure and c...
Information Retrieval of Sequential Data in Heterogeneous XML Databases
The XML language is a W3C standard sustained by both the industry and the scientific community. Therefore, the available information annotated in XML keeps and will keep increasing in size. Nonetheless, not only the volume of the XML information is increasing but also its complexity. The XML documents evolved from plain structured text representations, to documents having complex and heterogene...
Cooperative XML (CoXML) Query Answering at INEX 03
The Extensible Markup Language (XML) is becoming the most popular format for information representation and data exchange. Much research has been investigated on providing flexible query facilities while aiming at efficient techniques to extract data from XML documents. However, most of them are focused on only the exact matching of query conditions. In this paper, we describe a cooperative XML...